Creators/Authors contains: "David Abel*, John Winder*"


  1. Effective options can make reinforcement learning easier by enhancing an agent's ability to both explore in a targeted manner and plan further into the future. However, learning an appropriate model of an option's dynamics is hard, requiring estimating a highly parameterized probability distribution. This paper introduces and motivates the Expected-Length Model (ELM) for options, an alternate model for transition dynamics. We prove ELM is a (biased) estimator of the traditional Multi-Time Model (MTM), but provide a non-vacuous bound on their deviation. We further prove that, in stochastic shortest path problems, ELM induces a value function that is sufficiently similar to the one induced by MTM, and is thus capable of supporting near-optimal behavior. We explore the practical utility of this option model experimentally, finding consistent support for the thesis that ELM is a suitable replacement for MTM. In some cases, we find ELM leads to more sample efficient learning, especially when options are arranged in a hierarchy.
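The contrast between the two option models in the abstract can be sketched empirically. In this hedged illustration (function and variable names are illustrative, not from the paper), MTM discounts each sampled option outcome by gamma raised to that execution's observed duration, while ELM applies a single discount factor, gamma raised to the option's mean duration, to the empirical terminal-state distribution:

```python
# Illustrative sketch: estimating option transition models from sampled
# executions. Each sample is (terminal_state, duration_k) for one run of
# a single option from a fixed start state. Names are assumptions for
# illustration, not the paper's code.

GAMMA = 0.95


def mtm_model(samples):
    """Multi-Time Model estimate: weight each sample by gamma**k, where k
    is that execution's duration, then average over all samples."""
    model = {}
    for s_next, k in samples:
        model[s_next] = model.get(s_next, 0.0) + GAMMA ** k
    return {s: w / len(samples) for s, w in model.items()}


def elm_model(samples):
    """Expected-Length Model estimate: empirical terminal-state frequencies
    scaled by a single factor gamma**(mean duration)."""
    mean_k = sum(k for _, k in samples) / len(samples)
    counts = {}
    for s_next, _ in samples:
        counts[s_next] = counts.get(s_next, 0) + 1
    return {s: (GAMMA ** mean_k) * c / len(samples) for s, c in counts.items()}


# Example: three executions of one option, two reaching state "g", one "h".
samples = [("g", 3), ("g", 5), ("h", 4)]
print(mtm_model(samples))
print(elm_model(samples))
```

ELM only needs the expected duration rather than the full joint distribution over terminal states and durations, which is the source of both its bias relative to MTM and its reduced estimation burden.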